|
In computer science, Union Find is an algorithm for doing certain operations on sets. This page is about proof of ''O''(log *''n'') amortized time 〔Raimund Seidel, Micha Sharir. "Top-down analysis of path compression", SIAM J. Computing 34(3):515–525, 2005〕 of Union Find〔Robert E. Tarjan and Jan van Leeuwen. Worst-case analysis of set union algorithms. Journal of the ACM, 31(2):245–281, 1984.〕 Statement: If ''m'' operations, either Union or Find, are applied to ''n'' elements, the total run time is ''O''(''m'' log *''n''), where log * is the iterated logarithm. ==Proof== Lemma 1: As the find function follows the path along to the root, the rank of node it encounters is increasing. :Proof: claim that as Find and Union operations are applied to the data set, this fact remains true over time. Initially when each node is the root of its own tree, it's trivially true. The only case when the rank of a node might be changed is when the Union by Rank operation is applied. In this case, a tree with smaller rank will be attached to a tree with greater rank, rather than vice versa. And during the find operation, all nodes visited along the path will be attached to the root, which has larger rank than its children, so this operation won't change this fact either. Lemma 2: A node ''u'' which is root of a subtree with rank ''r'' has at least 2''r'' nodes. :Proof: Initially when each node is the root of its own tree, it's trivially true. Assume that a node ''u'' with rank ''r'' has at least 2''r'' nodes. Then when two tree with rank ''r'' Unions by Rank and form a tree with rank ''r'' + 1, the new node has at least 2''r'' + 2''r'' = 2''r'' + 1 nodes. Lemma 3: The maximum number of nodes of rank ''r'' is at most ''n''/2''r''. :Proof: From lemma 2, we know that a node ''u'' which is root of a subtree with rank ''r'' has at least 2''r'' nodes. We will get the maximum number of nodes of rank ''r'' when each node with rank ''r'' is the root of a tree that has exactly 2''r'' nodes. In this case, the number of nodes of rank ''r'' is ''n''/2''r'' For convenience, we define "bucket" here: a bucket is a set that contains vertices with particular ranks. We create some buckets and put vertices into the buckets according to their ranks inductively. That is, vertices with rank 0 go into the zeroth bucket, vertices with rank 1 go into the first bucket, vertices with ranks 2 and 3 go into the second bucket. If the Bth bucket contains vertices with ranks from interval (2''r'' − 1 ) = (R - 1 ) then the (B+1)st bucket will contain vertices with ranks from interval (2''R'' − 1 ). We can make two observations about the buckets. # The total number of buckets is at most log *''n'' #:Proof: When we go from one bucket to the next, we add one more two to the power, that is, the next bucket to (2''B'' − 1 ) will be (22''B'' − 1 ) #The maximum number of elements in bucket (2''B'' – 1 ) is at most 2''n''/2''B'' #:Proof: The maximum number of elements in bucket (2''B'' – 1 ) is at most ''n''/2''B'' + ''n''/2''B''+1 + ''n''/2''B''+2 + … + ''n''/22^''B'' – 1 ≤ 2''n''/2''B'' Let ''F'' represent the list of "find" operations performed, and let Then the total cost of ''m'' finds is ''T'' = ''T''1 + ''T''2 + ''T''3 Since each find operation makes exactly one traversal that leads to a root, we have ''T''1 = ''O''(''m''). Also, from the bound above on the number of buckets, we have ''T''2 = ''O''(''m''log *''n''). For T3, suppose we are traversing from ''u'' to ''v'', where ''u'' and ''v'' have rank in the bucket (2''B'' − 1 ). From lemma 1, we know that the number of times we traversed a link (''u'',''v'') where ''u'' and ''v'' were in the same bucket is at most 2''B'' − 1 − ''B'', which is at most 2''B''. Therefore, From Observations 1 and 2, we can conclude that Therefore, ''T'' = ''T''1 + ''T''2 + ''T''3 = ''O''(''m'' log *''n''). 抄文引用元・出典: フリー百科事典『 ウィキペディア(Wikipedia)』 ■ウィキペディアで「Proof of O(log*n) time complexity of union–find」の詳細全文を読む スポンサード リンク
|